Prediction of daily new COVID-19 cases ‐ Difficulties and possible solutions

Epidemiological compartmental models, such as SEIR (Susceptible, Exposed, Infectious, and Recovered) models, have been generally used in analyzing epidemiological data and forecasting the trajectory of transmission of infectious diseases such as COVID-19. Experience shows that accurately forecasting the trajectory of COVID-19 transmission curve is a big challenge for researchers in the field of epidemiological modeling because multiple unquantified factors can affect the trajectory of COVID-19 transmission. In the past years, we used a new compartmental model, l-i SEIR model, to analyze the COVID-19 transmission trend in the United States. Unlike the conventional SEIR model and the delayed SEIR model that use or partially use the approximation of temporal homogeneity, the l-i SEIR model takes into account chronological order of infected individuals in both latent (l) period and infectious (i) period, and thus improves the accuracy in forecasting the trajectory of transmission of infectious diseases, especially during periods of rapid rise or fall in the number of infections. This paper describes (1) how to use the new SEIR model (a mechanistic model) combined with fitting methods to simulate or predict trajectory of COVID-19 transmission, (2) how social interventions and new variants of COVID-19 significantly change COVID-19 transmission trends by changing transmission rate coefficient βn, the fraction of susceptible people (Sn/N), and the reinfection rate, (3) why accurately forecasting COVID-19 transmission trends is difficult, (4) what are the strategies that we have used to improve the forecast outcome and (5) what are some successful examples that we have obtained.


Introduction
The World Health Organization (WHO) declared an end to the COVID-19 global health emergency on May 5, 2023 [1]. The COVID-19 pandemic has greatly damaged the health of people around the world. Over the past four and a half years, more than 7 million people died from COVID-19 globally [2]. COVID-19 is caused by the SARS-CoV-2 virus, which is prone to mutation and the generation of genetic variants [3]. Since its first outbreak in 2019, SARS-CoV-2 has continually evolved, resulting in the emergence of several prominent variants, including Alpha, Beta, Delta, and Omicron, that have gained more efficient transmission, severity, and immune evasion properties. The latest SARS-CoV-2 variant, Omicron, had the strongest breakthrough infectivity and re-infectivity compared to the previous SARS-CoV-2 variants [4][5][6].
Throughout the pandemic, researchers used mathematical models to analyze COVID-19 data in order to better understand transmission patterns, monitor disease severity, anticipate future epidemic outcomes [7], and justify the adoption of intervention measures [8]. Among these mathematical models, compartmental models that describe the disease as a sequence of stages from infection to recovery, such as the SEIR (Susceptible-Exposed-Infectious-Recovered) model, have been generally adopted to forecast or simulate future transmission trajectories [9,10]. These compartmental models provide a parsimonious (i.e., using few parameters) approach to understanding important behaviors of epidemic pathways. Experience has shown that such models generate robust results that strengthen their usefulness [11]; however, it has been recognized that forecasting COVID-19 transmission trajectories remains a big challenge for mathematical modelers [7,9,11,12]. The conventional compartmental models [13,14] assume that infected individuals in each compartment have no temporal heterogeneity, so all infected individuals in a compartment have the same probability of transferring to the next compartment. However, the reality is different: individuals are usually infected on different days in a chronological order, so on average, individuals infected earlier in one compartment will be transferred to the next compartment at an earlier time. To describe the effects of disease latency, time delays have been included in compartmental models such as the SIR and SEIR models [15][16][17]. However, these time-delayed models did not completely solve the problem related to temporal heterogeneity of infected individuals, because the rate (dI(t)/dt) at which infectious individuals exit compartment I is still proportional to I(t) (see Eqn (1.1) in Hattaf's paper, Eqn (15) in Huang's paper, and Eqn (2.4) in Cooke's paper). In our recent paper about the l-i SEIR model, we demonstrated that the terms proportional to E(t) or I(t) are major contributors to calculation errors due to the assumption of temporal homogeneity underlying these two terms [18]. The l-i SEIR model takes into account the temporal heterogeneity, or chronological order, of infected individuals in both period l (latent period or compartment E) and period i (infectious period or compartment I) based on the first-in, first-out rule. It was demonstrated that, when calculating the transfer rate of infected individuals from one compartment of the SEIR model to the next, the temporal homogeneity approximation in the conventional SEIR model leads to calculation errors that increase linearly with the rate of change in the number of infectious individuals. Despite the improvement in calculation accuracy of the SEIR model after taking into account the chronological order of infected individuals, multiple other factors (such as interventions on social distancing, face masks, vaccination, and the emergence of new, more contagious COVID-19 variants) may still affect the accuracy of predictions from compartmental models [7,9,11]. Understanding how these factors affect forecast results is important to improving forecast accuracy.
In a review article, Holmdahl and Buckee [9] described forecasting models, mechanistic models, and hybrid approaches in modeling studies of COVID-19 transmission. Forecasting models typically fit a line or curve to data and extrapolate from there. In contrast, mechanistic models, like the SEIR model, mimic the way COVID-19 spreads and can be used to simulate future transmission scenarios under various assumptions. There are also hybrid approaches, such as the one covered in this paper, which fit a curve calculated from the l-i SEIR model to reported COVID-19 data and extrapolate the calculated curve to forecast the trajectory of COVID-19 spread in the future. We will describe the difficulties that we encountered in predicting transmission trajectories of COVID-19 from the l-i SEIR model and some strategies that we used to overcome these difficulties. Because Omicron has been the dominant SARS-CoV-2 strain in the U.S. since late December 2021 [19], our data analysis mainly focuses on COVID-19 transmission caused by Omicron in the United States, covering the whole period from the early outbreak of Omicron in the US to May 5, 2023, when data on daily COVID-19 cases in the US were no longer updated on websites. As the final result of this study, we demonstrate our accurate prediction of the trajectory of daily COVID-19 cases in the US, which was documented on a public website (Twitter, now known as X), over a nearly 3-month period from February 10, 2023, to May 5, 2023. In a PubMed search, we did not find a similar study. We hope that this paper will help researchers share their experiences in predicting the spread of infectious diseases.

Methods
The transmission dynamics of Omicron-caused COVID-19 described by the l-i SEIR model in this study are illustrated in Fig 1. In this section, we present the l-i SEIR model equations with and without Omicron-to-Omicron reinfection and describe the procedures for determining the parameters and coefficients in these equations.

The l-i SEIR model with or without Omicron-to-Omicron reinfections
Because the latest SARS-CoV-2 variant, Omicron, has strong breakthrough infectivity and reinfectivity relative to the previous COVID-19 variants [4][5][6], we hypothesized that during the early stages of an Omicron-caused COVID-19 outbreak, most people who were not previously infected with Omicron were susceptible to Omicron infection. Furthermore, we hypothesized that Omicron reinfection after an Omicron-caused infection (or Omicron-to-Omicron reinfection) was negligible in the first big wave of the Omicron-caused COVID-19 outbreak. This hypothesis was consistent with a later report that the mean time to Omicron-to-Omicron reinfection after the initial infection is about 22 weeks (or 5 months) [20], which is significantly longer than the time frame (around 2 months, from late December 2021 to late February 2022) of the first big wave (outbreak) of Omicron-caused COVID-19 transmission in the USA. If Omicron-to-Omicron reinfection can be ignored, the l-i SEIR (Susceptible-Exposed-Infectious-Recovered) epidemic model for Omicron-caused COVID-19 transmission can be described by the following recursive equations: To connect the calculated model variable with the daily new COVID-19 cases, we assume: Definitions of all variables, parameters, and coefficients in Equation 1 are listed in Table 1. In Eqn (1a), N is the number of susceptible people right before the infectious disease begins to spread. If all people in the population are susceptible to the infectious agent before the disease begins to spread, N equals the population P. However, if a portion of the population has immunity to the disease before it begins to spread, N is smaller than P. For the initial conditions of Eqns (1a)-(1d), we assume: (a) S_n = N and E_n = I_n = R_n = 0 for n < 0; and (b) S_0 = N-1, E_0 = 1, I_0 = R_0 = 0.
Eqns (1a)-(1d) were derived based on the following assumption: in the outbreak period of Omicron-caused COVID-19, the change in S_n is proportional to I_n and to S_n/N, and Omicron-to-Omicron reinfection can be ignored. Under this assumption, both S_n/N and the number of exposed individuals entering compartment E per day, β_n I_n (S_{n-1}/N), will gradually approach 0 in the later stages of COVID-19 spread if no effective public health interventions (such as wearing masks, social distancing, and quarantine) are implemented during the pandemic. The mathematical model expressed by Eqns (1a)-(1e) describes the transmission process of Omicron-caused COVID-19 without considering Omicron-to-Omicron reinfections. However, in reality, the Omicron-to-Omicron reinfection rate, although small, is not zero and cannot be ignored in the later period of Omicron-caused COVID-19 spread. Our data analysis of Omicron-caused COVID-19 spread shows that, taking the Omicron-to-Omicron reinfection rate into account, the number of exposed individuals who enter the latent period (or compartment E) per day can be expressed as β_n I_n b_n, where b_n is a non-zero constant. To simplify the calculation of E_n and I_n in Eqns (1b) and (1c), we still use (S_n - S_{n-1}) to represent the number of exposed individuals entering compartment E per day when the Omicron-to-Omicron reinfection rate cannot be ignored, but Eqn (1a) is replaced by the following equation:

In Eqn (1a)', S_n(≥0) = S_n if S_n > 0, and S_n(≥0) = 0 if S_n ≤ 0. The second term in the square brackets consists of a reinfection rate coefficient b_n (0 ≤ b_n ≤ 1) and a weight factor (N - S_{n-1}(≥0))/N. When S_n is close to N, the weight factor is close to 0 and the first term in the square brackets plays the main role. However, when S_n is much smaller than N, the weight factor is close to 1 and the second term in the square brackets plays the main role. If b_n = 0, there is no Omicron-to-Omicron reinfection. In this situation, S_n can vary from N (no one is infected) to 0 (all susceptible people are infected). If the Omicron-to-Omicron reinfection rate is non-negligible, then b_n is greater than 0. In this situation, S_n can vary from N (no one is infected) to a negative number. A negative number means that not only have all susceptible people been infected, but some of them have also been re-infected.
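The recursion described above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the authors' Excel program: it assumes that the number of newly exposed individuals on day n is β_n I_n multiplied by the bracketed term of Eqn (1a)', that every cohort spends exactly l days in compartment E and then exactly i days in compartment I (first-in, first-out), and that the calculated daily cases are y_n = α I_n. The names `susceptible_fraction` and `simulate_li_seir` are hypothetical.

```python
def susceptible_fraction(s_prev, n_total, b=0.0):
    """Bracketed term of Eqn (1a)': S(>=0)/N + b * (N - S(>=0))/N."""
    s_pos = max(s_prev, 0.0)  # S_{n-1}(>=0): a negative S is clipped to 0
    return s_pos / n_total + b * (n_total - s_pos) / n_total


def simulate_li_seir(beta, n_total, latent=4, infectious=10,
                     alpha=0.01453, b=0.0):
    """Run the recursion for len(beta) days; beta[n] is the coefficient on day n."""
    days = len(beta)
    c = latent + infectious
    x = [0.0] * days          # newly exposed individuals per day
    S = [0.0] * days
    E = [0.0] * days
    I = [0.0] * days
    R = [0.0] * days
    x[0] = 1.0                # the first person is exposed on day 0
    S[0], E[0] = n_total - 1.0, 1.0
    for n in range(1, days):
        # First-in, first-out: the cohort exposed on day k is infectious
        # on days k+latent .. k+c-1, so I_n only needs earlier cohorts.
        I[n] = sum(x[max(0, n - c + 1):max(0, n - latent + 1)])
        x[n] = beta[n] * I[n] * susceptible_fraction(S[n - 1], n_total, b)
        S[n] = S[n - 1] - x[n]
        E[n] = sum(x[max(0, n - latent + 1):n + 1])
        R[n] = sum(x[:max(0, n - c + 1)])
    y = [alpha * value for value in I]  # assumed link: y_n = alpha * I_n
    return {"S": S, "E": E, "I": I, "R": R, "y": y}


# Example: 200 days with a constant coefficient and no reinfection (b = 0).
result = simulate_li_seir([0.7] * 200, 250_000_000)
```

With b = 0 the four compartments always sum to N, which is a convenient sanity check; with b > 0, S_n is allowed to become negative, as described in the text.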

Estimation of parameters and coefficients in the model equations
After mid-December 2020, COVID-19 vaccines were given to people in the US. Since then, COVID-19 vaccines gradually became an important factor affecting the trajectory of COVID-19 transmission. In 2021, the COVID-19 Alpha variant emerged in the USA and caused a transmission peak in mid-April; then the Delta variant emerged in the USA and caused a transmission peak in early September [21]. In this situation, multiple factors, including vaccination (affecting S_0), breakthrough infection [5] (affecting S_n), reinfection [4] (affecting the rate equations), and intervention measures (affecting the transmission rate coefficient β_n), were able to affect the trajectory of y_n. The coexistence of these factors made it difficult to identify who was susceptible and who was immune in the US, and therefore made simulations or predictions of the y_n trajectory more complicated. This complicated situation changed when the Omicron variant of the COVID-19 virus began to spread. The Omicron variant had the strongest breakthrough infectivity and re-infectivity compared to the previous COVID-19 variants [4][5][6]. Vaccine effectiveness against Omicron, compared with the Delta variant, dropped from 0.52 to 0.38 for those who had received their second dose 180 days or more earlier [6]. Considering that many people in the US had received only one dose or no vaccine at all, the actual number of people with immunity to the Omicron variant may be less than 38% (0.38) of the US population (P = 330,000,000). Our simulations show that the transmission of the Omicron variant in the US can be treated as the transmission of a new infectious disease from the beginning by assuming that only a fraction of ~0.25 of the US population had immunity to the original Omicron variant in the early period of the Omicron-caused COVID-19 outbreak in the US. This indicates that N is ~75% of the population P (N ≈ 0.75P ≈ 250,000,000). Here N = 0.75P is an estimated average number of susceptible individuals at the start of the COVID-19 outbreak in the United States due to the original Omicron variant, while the remaining 0.25P (or P-N) is the estimated average number of people in the United States who were immune to the original Omicron variant. The actual situation is more complicated: a portion of the people classified as susceptible may already have had some immunity, albeit lower, to the original Omicron variant. Likewise, individuals classified as immune to the original Omicron variant may not have been 100% immune to it.
When we used the l-i SEIR model to simulate and predict the daily new COVID-19 cases, the first important thing we recognized was that the transmission rate coefficient β_n of COVID-19 varies with time [22,23]. The coefficient β_n represents the efficiency of the interaction between I_n and S_n/N. During outbreaks of COVID-19, including Omicron-caused COVID-19, public health interventions (such as maintaining a relatively large social distance between people, wearing face masks, and staying at home) were generally used to reduce the COVID-19 transmission rate. These interventions reduced the transmission rate coefficient β_n by lowering the efficiency of the interaction between I_n and S_n/N. Thus, β_n varies with time, especially during a COVID-19 outbreak. The time-dependent transmission rate coefficient has also been recognized recently by other researchers studying COVID-19 transmission [24][25][26][27][28][29][30][31][32]. To simulate the transmission process of Omicron variants, we first estimated the initial value of the transmission rate coefficient β_n from the reported number of daily new COVID-19 cases before Omicron started to spread in the US, using the method described previously [18,33]. The related computation program can be found in the worksheet "time-dependent rate" of the Excel file [34]. This estimated initial value of β_n (β_n = 0.7), combined with other estimated or determined parameters and coefficients (such as l, i, α and N), was used for calculating/simulating Omicron-caused daily COVID-19 cases.
We previously demonstrated how to obtain the values of l, i and α of the l-i AIR model from the daily new COVID-19 cases reported in early 2020, when the COVID-19 outbreak began in the US [23,33]. The l-i SEIR model is another form of the l-i AIR model [18], and the two models can be converted to each other with the same set of parameters l, i and α [34]. Thus we can use the same method described previously for the l-i AIR model [33,35] to determine the parameters l, i and α for the l-i SEIR model. Briefly, we first plotted the logarithm of ȳ_n (the 7-day average of daily reported new COVID-19 cases), log(ȳ_n), in the early period of the COVID-19 outbreak (see Table 1 in [33]) vs date (or n), which formed a straight line with a slope k_0 = 0.1368. Then, we calculated y_n from Eqn (1) of the l-i AIR model [33] or the above Eqn (1) of the l-i SEIR model for a given pair of parameters l and i, assuming that β_n = 1 and α = 1 under the condition that the total number of infections is much less than N (or S_n/N ≈ 1). A plot of the logarithm of y_n, log(y_n), vs n also forms a straight line for the given pair of l and i. In this way, we could obtain the slope k(l,i) of the straight line for any given pair of l and i (see Table 2 in [33] and the related calculation program in [35]). The pair of l and i whose slope k(l,i) is closest to k_0 was chosen for use in the l-i AIR or l-i SEIR model for simulating an epidemic curve of COVID-19. When l = 4 and i = 10, the slope of the plot of log(y_n) vs date is 0.1372 (or k(4,10) = 0.1372), which is closest to the slope k_0 = 0.1368 of the plot of log(ȳ_n) vs date. However, the intercepts of the two straight lines (log(y_n) vs date and log(ȳ_n) vs date) may differ considerably. By selecting a suitable date for the first non-travel-related COVID-19 case in the US and adjusting the value of α, we could change the intercept of the plot of log(y_n) vs n. In this way, we could find the value of α that minimizes the difference between the two straight lines (log(y_n) vs n and log(ȳ_n) vs n) by the least squares method. The difference was minimized when α = 0.01453 and the first non-travel-related U.S. case (the first contagious person in the USA) was assumed to occur on February 6, 2020, which is 3 days earlier than the date that we estimated for New York City [23]. These estimated first-case dates are within the suggested time range [36,37]. The procedure for determining l, i and α of the l-i AIR model was previously described in detail [33], and the related calculation programs in Excel can be found in the Mendeley Data repository [35].
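The slope-matching step for l and i can be illustrated with a short Python sketch. It is a hedged illustration rather than the authors' Excel procedure: it assumes that, with β_n = 1 and S_n/N ≈ 1, the number of newly exposed individuals on day n equals the number of cohorts currently in their infectious window (first-in, first-out), and it takes the logarithm to base 10. The function names `growth_slope` and `closest_pair` are hypothetical.

```python
import math


def growth_slope(latent, infectious, days=80):
    """Slope of log10(new exposures) vs day for beta_n = 1 and S_n/N ~ 1."""
    c = latent + infectious
    x = [1.0] + [0.0] * (days - 1)   # one exposed person on day 0
    for n in range(1, days):
        # Cohorts exposed on days n-c+1 .. n-latent are infectious on day n.
        x[n] = sum(x[max(0, n - c + 1):max(0, n - latent + 1)])
    # Average slope over the last 10 simulated days, after transients decay.
    return (math.log10(x[-1]) - math.log10(x[-11])) / 10.0


def closest_pair(k0, latent_values, infectious_values):
    """Return the (l, i) pair whose slope k(l, i) is closest to k0."""
    pairs = [(l, i) for l in latent_values for i in infectious_values]
    return min(pairs, key=lambda p: abs(growth_slope(*p) - k0))


k_4_10 = growth_slope(4, 10)
best = closest_pair(0.1368, range(2, 8), range(5, 15))
```

Because y_n is proportional to I_n in this reading of the model, the slope of log(y_n) is the same as the slope computed here; α only shifts the intercept, which is why it can be fitted separately by least squares.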
The coefficient α in the l-i SEIR model is defined as the transient incidence rate of the infectious people, and α is related to the procedure used for confirming a COVID-19 case when we study COVID-19 transmission. In general, α may vary with time. We observed significant changes in α when analyzing the spread of COVID-19 in Wuhan in early 2020. These changes in α were mainly caused by the use of some special interventions, such as a substantial increase in the number of viral tests and the use of 16 Fangcang shelter hospitals to admit a large number of people infected with COVID-19 [23]. However, our data analysis of New York City, New York State and the United States shows that α is nearly constant: 0.01176 in New York City and New York State [23], and 0.01453 in the USA [18]. Assuming that α is a constant in the USA, we calculated the cumulative number of COVID-19 infections, including asymptomatic infections, in the USA from late February 2020 to September 30, 2020. The calculated number on September 30, 2020 is very close to the real number of infections (including asymptomatic COVID-19 infections) in the USA reported on September 30, 2020 [18], suggesting that the parameters used in the model and the assumption of a constant α in the USA are reasonable.

Results and discussion
The following data analysis mainly focuses on COVID-19 transmission caused by Omicron in the United States from the early outbreak of Omicron in the US (late 2021) to May 5, 2023, when data on daily COVID-19 cases in the US were no longer updated on websites.

Predicting the peak height and the peak date of reported daily COVID-19 cases (ȳ_n)
COVID-19 is highly contagious. Public health interventions are generally used to reduce the transmission rate coefficient β_n during a COVID-19 outbreak. As a result, β_n may gradually decrease before the reported number of daily new COVID-19 cases (ȳ_n) reaches its peak. Because the quantitative relationship between these interventions and the values of β_n is unknown, and transmission rates of COVID-19 in the early outbreak period are highly sensitive to β_n, it is difficult to accurately predict the peak date (the date when ȳ_n reaches its peak) and the peak height of ȳ_n with inaccurate values of β_n. In the following, we perform a series of simulations to examine how an inaccurate β_n affects the magnitude of errors in predicting the peak height and the peak date of ȳ_n. Furthermore, we explore how to predict the ȳ_n peak based on these simulations.
In the simulations, we assumed that we were in the early outbreak period of Omicron-caused COVID-19, when ȳ_n was rising rapidly before reaching a peak, and that we hoped to use the l-i SEIR model and the latest ȳ_n data available at that time to predict the height and date of the ȳ_n peak. Furthermore, we had: l = 4, i = 10 [18,23,33], α = 0.01453 [18,23], β_0 = 0.7 and N = 250,000,000. By adjusting β_n, we could fit the calculated y_n to the latest reported ȳ_n data. In this way, we could determine the value of β_n on the latest date. For example, assuming that today is December 26, 2021 and that we know all ȳ_n as of December 25, 2021, we can determine all values of β_n on or before December 25, 2021 by fitting the calculated y_n from Eqns (1a)-(1e) to the reported ȳ_n as described previously [18,23,33]. The value of β_n determined on December 25, 2021, is listed in Table 2, and the corresponding simulated trajectories of y_n are shown in Fig 2A. In this way, we forecasted the height and date of the ȳ_n peak on different days before the real ȳ_n peak appeared. The y_n peak predicted on December 25, 2021 is 1.84 million, which has the largest error in comparison with the reported peak ȳ_n (0.81 million) on January 13 and 14, 2022. As the prediction day (the day on which the prediction of the y_n peak is made) approaches January 13, 2022, the predicted height of the y_n peak approaches the actual reported height of the ȳ_n peak (Fig 2A). Thus, although it is difficult to accurately predict the height of the ȳ_n peak at an early stage of a COVID-19 outbreak because β_n varies continuously, the prediction accuracy improves significantly if the latest-determined β_n is used as the prediction day approaches the date of the reported ȳ_n peak. Usually, the height of the y_n peak predicted on a day during the rising phase of the actual ȳ_n peak, and much earlier than the date of the actual ȳ_n peak, may be significantly greater than the actual height of the ȳ_n peak; therefore, the height of the ȳ_n peak predicted in this way can be considered an estimated upper limit of the height of the ȳ_n peak. In contrast, simulations (Fig 2B) show that the largest error in predicting the date of the ȳ_n peak occurs about 15 days before the date of the ȳ_n peak, and that an earlier prediction date may not cause a larger error in predicting the date of the ȳ_n peak. Thus, the date of the ȳ_n peak may be predicted within a limited error (less than 5 days in Fig 2B) with this l-i SEIR model during the early rising phase of the COVID-19 outbreak, two to three weeks before the date of the ȳ_n peak.
In the prediction of the ȳ_n peak described above, we assumed that the time-dependent transmission rate coefficient β_n became a constant on and after the day on which the prediction of the ȳ_n peak was made. In this way, we can predict an upper limit of the height of the ȳ_n peak and predict the date of the ȳ_n peak within a limited error. One could also use a linear or non-linear technique to extrapolate future values of the time-dependent β_n, which may improve the accuracy in predicting the height of the ȳ_n peak (for a short period depending on the prediction accuracy of β_n) [38].
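The "hold β_n constant after the prediction day" strategy can be sketched as follows. This is an illustrative sketch only: the recursion assumes first-in, first-out cohorts with y_n = α I_n and no reinfection, the β_n history in the example is invented for demonstration (it is not the fitted series from Table 2), and the names `daily_cases` and `forecast_peak` are hypothetical.

```python
def daily_cases(beta, n_total, latent=4, infectious=10, alpha=0.01453):
    """Calculated daily cases y_n for a given series of beta_n (no reinfection)."""
    days = len(beta)
    c = latent + infectious
    x = [0.0] * days              # newly exposed individuals per day
    y = [0.0] * days
    x[0] = 1.0                    # first exposed person on day 0
    s = n_total - 1.0             # remaining susceptible individuals
    for n in range(1, days):
        infectious_now = sum(x[max(0, n - c + 1):max(0, n - latent + 1)])
        x[n] = beta[n] * infectious_now * max(s, 0.0) / n_total
        s -= x[n]
        y[n] = alpha * infectious_now
    return y


def forecast_peak(beta_history, horizon, n_total):
    """Freeze beta at its latest fitted value and locate the simulated peak."""
    beta = list(beta_history) + [beta_history[-1]] * horizon
    y = daily_cases(beta, n_total)
    peak_day = max(range(len(y)), key=y.__getitem__)
    return peak_day, y[peak_day]


# Hypothetical history: beta stays at 0.7 until day 50, then falls 0.01 per day.
history = [0.7] * 51 + [0.7 - 0.01 * k for k in range(1, 21)]
early = forecast_peak(history[:56], 250, 250_000_000)   # prediction on day 55
late = forecast_peak(history, 250, 250_000_000)         # prediction on day 70
```

In this toy example the earlier forecast freezes a larger β_n and therefore yields the higher simulated peak, which mirrors the observation in the text that an early prediction behaves like an upper limit on the peak height.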

Predicting the trajectory of ȳ_n after the ȳ_n peak
After the reported number of daily new COVID-19 cases, ȳ_n, passes its peak, β_n may remain the same or even decrease a bit until it is confirmed that the peak has passed. Then β_n will increase because the interventions for social distancing and wearing face masks will be gradually lifted. Furthermore, new Omicron subvariants with greater infectivity may spread on any day after the ȳ_n peak and increase ȳ_n again. These unknown or undetermined factors make it almost impossible to make a long-term prediction of the exact trajectory of ȳ_n. However, since β_n most likely reaches its minimum value around the ȳ_n peak, if we use this minimum value of β_n to predict changes in ȳ_n in the near future, the simulated y_n curve will likely be lower than the reported ȳ_n curve. This enables us to predict the lower bound of the ȳ_n curve in the near future after the ȳ_n peak. The lower bound of the ȳ_n curve (dashed line) in Fig 3 was obtained by assuming that β_n = 0.16/day after January 22, 2022. In addition to calculating the lower bound of the ȳ_n curve, we can also calculate an upper bound of the ȳ_n curve by assuming that β_n rapidly increases to 1 or a greater number in a short period of time (solid line in Fig 3). This period is chosen to be significantly shorter than the actual period needed to increase β_n in the real world. In the calculations, we assumed that no new Omicron subvariants appear in this time period to affect ȳ_n substantially. As shown in Fig [41,42]. Among these sub-variants, BA.1 and BA.1.1 were dominant in the big peak of ȳ_n as of mid-February 2022 [43], and then other sub-variants followed separately. Considering that the later Omicron sub-variants had greater infectivity, we assume that each new sub-variant mentioned above can affect COVID-19 transmission by enlarging the number of susceptible people N.
To simulate the Omicron-caused changes in y_n after mid-February 2022, we allowed N to increase on some selected dates (from N = 250,000,000 to N = 332,400,000) between the end of 2021 and early October 2022, while β_n gradually increased to 1 as of mid-September 2022. In this way, the simulated y_n fits the reported ȳ_n very well (Fig 4) as of October 23, 2022. It should be noted that, when N increases to 332,400,000, almost all of the population in the US has become susceptible to the highly infectious Omicron sub-variants. To predict y_n after October 23, 2022, we let β_n continuously increase to 3.5 before the end of November 2022 and remain at 3.5 after November 2022. The predicted y_n (solid line) forms a plateau from late October 2022 to the end of November 2022, then decreases significantly after early December 2022 and drops to nearly 1000 cases/day by the end of January 2023. This predicted result was uploaded to Twitter in late October 2022 [44]. The reported daily COVID-19 cases matched the predicted results well until early December 2022 [45]. Our simulation and prediction showed that after August 2022, especially after the y_n plateau in early December 2022, an increase in β_n or the emergence of more contagious Omicron variants would not push y_n up. Based on the l-i SEIR model, this implies that herd immunity to Omicron had been reached in the United States. In the above l-i SEIR model, it was assumed that any individual infected by an Omicron sub-variant would not be reinfected by any other Omicron sub-variant or any new COVID-19 variant. However, in reality, Omicron-infected individuals still have a chance of being reinfected by an Omicron sub-variant, even though the chance is very low. Therefore, the infected people are not able to form perfect herd immunity. As we have seen, the reported daily new COVID-19 cases after late October 2022 (blue dots) formed a plateau between late October 2022 and late November 2022, which agreed with the predicted curve very well. However, the reported daily new COVID-19 cases increased slightly in the period between December 2022 and January 2023 because of social gatherings in the holiday season (Christmas and New Year), and a more contagious Omicron variant, XBB.1.5, also appeared in this period [46]. This deviation from the predicted curve based on the l-i SEIR model implies that a small fraction of Omicron-infected people can be re-infected by Omicron sub-variants and that Omicron-to-Omicron reinfection needs to be considered in the modelling.

Simulating and predicting the trajectory of ȳ_n in the presence of reinfection of Omicron infections
In the above l-i SEIR model, the number of susceptible people S_n varies between 0 and N, or 0 ≤ S_n ≤ N. If most susceptible people have been infected, then S_n is far smaller than N and the ratio S_n/N is near zero. Therefore, the number of daily new exposed people, (S_n - S_{n-1}) in Eqn (1a), is also near 0. However, if the rate of reinfection of Omicron-infected people is non-negligible, (S_n - S_{n-1}) cannot be near zero even if all susceptible people have been infected. Thus, we suggest that in the presence of a non-negligible rate of reinfection, the ratio S_{n-1}/N in Eqn (1a) should be replaced by [S_{n-1}(≥0)/N + b_n (N - S_{n-1}(≥0))/N] as shown in Eqn (1a)'. Based on Eqns (1a)' and (1b)-(1d), we simulated and predicted daily new COVID-19 cases on February 10, 2023 [47], assuming that b_n = 0.03, and compared them with later reported data until May 5, 2023 (Fig 5) [48], when data on daily COVID-19 cases in the US were no longer updated on websites.

Summary
Based on the l-i SEIR model, the authors described difficulties and discussed possible solutions in forecasting the peak date and the peak height of daily new COVID-19 cases (ȳ_n) caused by Omicron, the trajectory of ȳ_n after the ȳ_n peak, and the trajectory of ȳ_n after herd immunity was reached, in the presence or absence of Omicron-to-Omicron reinfection. Our simulations show that by using the β_n determined from the latest reported ȳ_n data, one may predict the date of the ȳ_n peak within a limited prediction error, and also predict an upper limit for the height of the ȳ_n peak. It is possible to accurately predict the trajectory of y_n after the ȳ_n peak for a few weeks (up to 4 weeks, from 1/22/2022 to 2/19/2022, as shown in Fig 3) with a constant β_n. Moreover, by calculating a lower limit and an upper limit of the y_n curve, one may successfully predict the trace of ȳ_n within the range between the lower and upper limits of the y_n curve for more than 3 months (from 1/22/2022 to 4/28/2022 in Fig 3). The l-i SEIR model without Omicron-to-Omicron reinfection could not explain the remaining non-negligible number of daily new COVID-19 cases after herd immunity was reached (S_n/N ≈ 0), suggesting that Omicron-to-Omicron reinfection should be taken into account in the model. The simulated y_n curve based on the l-i SEIR model with Omicron-to-Omicron reinfection fits the numbers of reported COVID-19 cases very well after herd immunity has been reached, and the predicted y_n curve is in good agreement with the number of daily new COVID-19 cases reported as of May 5, 2023, twelve weeks after the prediction of the y_n curve was made on February 10, 2023.

Table 1.
Variables, parameters & coefficients: Definition
n: Number of days passed since the day (n = 0) on which the first person was exposed
S_n: Number of remaining susceptible individuals who are able to contract the disease on day n
E_n: Number of exposed individuals who are in the latent period before becoming infectious on day n
I_n: Number of infectious individuals who are in the infectious period and are capable of transmitting the disease on day n
R_n: Number of people who have recovered and developed immunity on day n
β_n: The transmission rate coefficient on day n
l: The average time length of latent period
i: The average time length of infectious period
c: The sum of the average time length of latent period (l) and infectious period (i)
N: Total number of susceptible people right before the infectious disease spreads out
α: Transient incidence of the infectious people, which is a fraction between 0 and 1
y_n: Calculated number of the daily confirmed COVID-19 cases
ȳ_n: Reported number of the daily confirmed COVID-19 cases
P: Population
b_n: Final ratio of the remaining number of susceptible people to the total number of susceptible people (N)
https://doi.org/10.1371/journal.pone.0307092.t001

Other determined values of β_n before December 25, 2021 are not listed. With these determined values of β_n, we could not accurately predict when ȳ_n would reach its peak or what the height of the ȳ_n peak would be, because we did not know the accurate values of β_n after December 26, 2021. To make a prediction about the trajectory of ȳ_n in the near future from December 26, 2021, we assumed that β_n after December 26, 2021 was a constant, the same as the value determined on December 25, 2021. Thus, we could simulate a trajectory of y_n from Eqns (1a)-(1e) to see the peak date and peak height of the COVID-19 transmission wave, as shown in Fig 2A (the green solid line, peaking on January 15, 2022, with a peak height of 1.84 million cases/day). Repeating this process, we could obtain values of β_n on the later dates (from December 26, 2021 to January 9, 2022) as shown in Table 2.

Fig 3. The simulated lower limit (dashed line) and upper limit (solid line) of ȳ_n. The reported ȳ_n (red dotted line) is on or between the simulated lower and upper limits of ȳ_n. https://doi.org/10.1371/journal.pone.0307092.g003

Fig 5. Simulated, predicted, and reported numbers of Omicron-caused daily new COVID-19 cases in the United States after considering Omicron reinfection in the model. https://doi.org/10.1371/journal.pone.0307092.g005